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Malaria genetic variation has been extensively characterized, but the level of epigenetic plasticity remains largely un- 
explored. Here we provide a comprehensive characterization of transcriptional variation in the most lethal malaria 
parasite, Plasmodium falciparum, based on highly accurate transcriptional analysis of isogenic parasite lines grown under 
homogeneous conditions. This analysis revealed extensive transcriptional heterogeneity within genetically homogeneous 
clonal parasite populations. We show that clonally variant expression controlled at the epigenetic level is an intrinsic 
property of specific genes and gene families, the majority of which participate in host-parasite interactions, intrinsic 
transcriptional variability is not restricted to genes involved in immune evasion, but also affects genes linked to lipid 
metabolism, protein folding, erythrocyte remodeling, or transcriptional regulation, among others, indicating that epi- 
genetic variation results in both antigenic and functional variation. We observed a general association between hetero- 
chromatin marks and clonally variant expression, extending previous observations for specific genes to essentially all 
variantly expressed gene families. These results suggest that phenotypic variation of functionally unrelated P. falciparum 
gene families is mediated by a common mechanism based on reversible formation of H3K9me3-based heterochromatin. in 
changing environments, diversity confers fitness to a population. Our results support the idea that P. falciparum uses a bet- 
hedging strategy, as an alternative to directed transcriptional responses, to adapt to common fluctuations in its envi- 
ronment. Consistent with this idea, we found that transcriptionally different isogenic parasite lines markedly differed in 
their survival to heat-shock mimicking febrile episodes and adapted to periodic heat-shock with a pattern consistent with 
natural selection of pre-existing parasites. 



[Supplemental material is available for this article.] 

Adaptation to environmental changes is essential for survival. In 
the case of Plasmodium falciparum asexual blood stages, which are 
responsible for all clinical malaria symptoms and complications, 
the intra-erythrocytic niche provides a relatively stable environ- 
ment compared with the environments where free-living unicel- 
lular organisms reside, but there are still challenges that the parasite 
must confront (Mackinnon and Marsh 2010). Among the condi- 
tions that can change during the course of a blood infection or 
between different human hosts are immune pressure, drug pres- 
sure, host metabolic and nutritional conditions, presence and mag- 
nitude of febrile episodes, host genetics, and presence of competing 
parasites. 

Long-term, slow adaptation typically occurs at the genetic 
level. In P. falciparum, the genetic bases of adaptation have been 
extensively studied (Mu et al. 2010; Ochola et al. 2010; Van Tyne 
et al. 2011). However, adaptation to fast-fluctuating environmen- 
tal pressures requires a much faster time scale and reversibility and 
operates at the transcriptional level. In the majority of organisms, 
including bacteria, yeast, and higher eukaryotes, rapid adaptation 
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typically occurs through directed transcriptional responses: An 
external condition is sensed, which triggers a signaling cascade 
that results in transcription factor-driven activation or repression 
of specific genes that mediate adaptation to the new condition. 
Some evidence suggests that P. falciparum may have a limited ca- 
pacity to mount directed transcriptional responses: (1) There is a 
marked paucity of specific transcription factors in the P. falciparum 
genome (Coulson et al. 2004), despite the recent identification of 
an expanded family of AP2 domain-containing transcription fac- 
tors (ApiAP2) (Balaji et al. 2005; Campbell et al. 2010). The specific 
transcription factors identified seem insufficient to mount a wide 
range of transcriptional responses. (2) Analysis of transcriptional 
responses immediately or very soon after environmental pertur- 
bations failed to detect large alterations in the levels of transcripts 
related with the challenge, in contrast to similar studies in other 
eukaryotes (Gunasekera et al. 2007; Ganesan et al. 2008; Le Roch 
et al. 2008; Young et al. 2008). This led some investigators to 
conclude that P. falciparum has an unresponsive, hard-wired tran- 
scriptome (Ganesan et al. 2008; Le Roch et al. 2008). This is con- 
troversial because other studies detected changes in transcription 
upon various challenges (Functional Genomics Workshop Group 
et al. 2007; Oakley et al. 2007; Natalang et al. 2008; Tamez et al. 
2008; Hu et al. 2010), but the changes observed were of low am- 
plitude and could be explained by cell cycle arrest or sexual dif- 
ferentiation, or may represent pathways leading to parasite death. 
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rather than being authentic protective transcriptional responses 
(Functional Genomics Workshop Group et al. 2007; Natalang et al. 
2008). 

Regardless of the ability of P. falciparum to mount directed 
transcriptional responses, one source of transcriptional variability 
in malaria parasites is clonally variant gene expression. Variantly 
expressed genes can be found in different transcriptional states in 
genetically identical parasites at the same stage of life cycle pro- 
gression and grown under homogeneous conditions. The tran- 
scriptional state is clonally transmitted by epigenetic mechanisms 
(Scherf et al. 2008). Clonally variant gene expression has a clear 
adaptative potential, because spontaneous changes in expression 
lead to transcriptionally heterogeneous populations. It is well es- 
tablished in the population biology field that heterogeneous pop- 
ulations are fitter than homogeneous populations in changing 
environments, because diversity provides the grounds for natural 
selection of individuals with phenotypes that confer maximum 
fitness upon an environmental change (Kussell and Leibler 2005; 
Veening et al. 2008). Clonally variant gene expression in P. falciparum 
was initially described for var genes (Scherf et al. 2008), a family of 
about 60 genes encoding PfEMP-1, which plays a major role in 
pathogenesis and immune evasion. Other P. falciparum gene fam- 
ilies also show clonally variant expression, including gene families 
encoding proteins exported to the erythrocyte surface, and fami- 
lies linked to erythrocyte invasion. Variant expression of these 
gene families may also play a role in immune evasion (Cortes et al. 
2007; Lavazec et al. 2007; Petter et al. 2007; Cortes 2008; Scherf 
et al. 2008; Gomez-Escobar et al. 2010). 

Adaptation through clonally variant gene expression would 
imply that parasite populations spontaneously show transcrip- 
tional diversity for genes relevant for adaptation to common en- 
vironmental pressures. While genetic variation in P. falciparum has 
been extensively characterized (Jeff ares et al. 2007; Mu et al. 2007; 
Volkman et al. 2007; Van Tyne et al. 2011), the genome-wide ex- 
tent of clonally variant gene expression remains unknown. Some 
studies have compared global blood stages transcription between 
different P. falciparum lines, but their sample size and experimental 
design resulted in the identification of only a small number of 
variantly expressed genes: One study compared two isogenic lines 
selected to display different antigenic and adhesive properties 
(Mok et al. 2007), whereas another study compared the tran- 
scriptome of three genetically different parasite lines, focusing on 
the comparison of patterns of expression along the life cycle rather 
than comparing transcript levels (Llinas et al. 2006). Genome-wide 
transcriptional differences have also been analyzed in field isolates 
(Daily et al. 2007; Lemieux et al. 2009; Mackinnon et al. 2009), but 
these studies suffer intrinsic limitations such as the inability to 
distinguish between spontaneous variation and selection, and the 
common occurrence of mixed infections. Furthermore, some of 
the transcript-level differences observed may be attributable to 
genetic polymorphism. A recent study established that transcrip- 
tional diversity in P. falciparum has a strong genetic component 
(Gonzales et al. 2008). 

Here we comprehensively characterized the asexual blood- 
stage transcriptome of 21 parasite lines and compared transcript 
levels within sets of parasite lines sharing a common clonal origin. 
With this approach, transcriptional variation was studied in- 
dependently of the genetic component. We identified the majority 
of inherently variant gene families and observed extensive tran- 
scriptional heterogeneity among essentially isogenic parasites. 
Variantly expressed gene families participate in multiple aspects of 
host-parasite interactions, including but not limited to immune 



evasion. The characteristics of the genes showing clonally variant 
expression are suggestive of a central role for variant expression in 
adaptation, and this was experimentally demonstrated for adap- 
tation to periodic heat-shock mimicking malaria cyclical fever. We 
also show that functionally diverse variantly expressed gene fam- 
ilies are regulated by a common epigenetic mechanism involving 
heterochromatin formation. 

Results 

Identification of genes under clonally variant expression 

Transcriptional variability was first assessed for two stocks of the 
clonal parasite line 3D 7 maintained in different laboratories for 
several years, 3D7-A and 3D7-B, and three subclones of 3D7-A (Fig. 
lA, 3D 7). Subclones were cultured for the minimal number of 
generations necessary to obtain sufficient material for transcrip- 
tional analysis. For genes with stable clonally transmitted expres- 
sion patterns, this is akin to characterizing transcript levels in 
single parasites within the clonal population from which the 
subclones were derived (Cortes et al. 2007). The five parasite lines 
were tightly synchronized to a 0- to 5-h window, and RNA was 
obtained at seven time points along the asexual blood cycle. Relative 
transcript levels were determined by hybridizing samples against 
a common reference pool in a long-oligonucleotides microarray (Hu 
et al. 2007). Principal component analysis (PC A) revealed clustering 
of samples collected at the same time point, as expected for well- 
synchronized samples. Samples collected at the latest time points 
approached samples collected at the first time point, reflecting the 
cyclic nature of asexual blood growth (Fig. IB). 

The majority of P. falciparum genes show stage-specific ex- 
pression, and even small differences in the stage of cycle pro- 
gression can lead to large differences in observed transcript levels 
(Bozdech et al. 2003). This is a major confounder for microarray- 
based studies comparing expression between different parasite 
lines. To account for this, we estimated the stage of the cycle for 
each microarray sample using a statistical likelihood-based method 
(Supplemental Fig. SI; Lemieux et al. 2009) and plotted expression 
levels against estimated times post-invasion (Fig. 1C,D). To quan- 
tify, for each gene, the level of transcriptional variability among 
the parasite lines compared, we developed a score termed aMAFC 
(adjusted maximum average fold-change across half the time in- 
terval compared), which is an estimate of the expression fold- 
change between the parasite line that expresses the gene at the 
highest level and the parasite line that expresses it at the lowest 
level (see Methods). Only —5% of the genes showed substantial 
transcriptional variation (Fig. IE). The median aMAFC corresponds 
to the first gene in Figure IC. 

We generated a high-confidence list of genes showing different 
transcript levels among 3D7-derived parasite lines, based on criteria 
dependent on the aMAFC and permutation tests (see Methods). 
Comparative genome hybridization (CGH) analysis revealed that 
in six of the 144 genes selected, different transcript levels could be 
explained by genetic differences (copy number variations, CNVs). 
Five of these six genes fell within a duplication of —56 kb in 3D7-B, 
which resulted in approximately twofold transcript levels relative 
to parasite lines without the duplication (Fig. IF). After excluding 
these genes to focus only on epigenetic transcriptional variation, 
we generated a list of 138 genes that we called the "3D7 variantome" 
(Supplemental Table SI), defined as the set of genes with clonally 
variant expression in 3D7. Some of these genes (76 of 138) belong 
to the large hypervariable families var, rif stevor, and pfmc-2tm (Sup- 
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Figure 1 . Identification of genes under clonally variant expression. (A) Parasite lines used in this study. Parental parasite lines 3D7, 7GS, D1 0, and HB3A 
are of clonal origin. 3D7-A and 3D7-B are stocks of 3D7 maintained in different laboratories for some years (Cortes et al. 2004). 7G8 subclones were 
identified as self-fertilization events after passage through mosquito and splectomized chimpanzee (Hayton et al. 2008). HB3B was derived from self- 
crossing HB3A (Walliker et al. 1 987). Parentals comparison included the three parental parasite lines shown in bold (7G8, D1 0, and HB3A). (S) Principal 
component analysis (PCA) of the time-course transcriptional analysis of 3D7 parasite lines. Samples from the same parasite line are represented by the 
same symbol, whereas samples collected at the same time point are represented by the same color. (C) Time-course expression plots of representative 
genes not showing clonally variant expression. Expression levels (log2 ratio of expression relative to the reference pool) are plotted against statistically 
estimated culture age (in hours post-invasion). The first gene has the median aMAFC, whereas the other three genes are single-copy genes commonly used 
as controls (rhopH2, seryl tRNA synthetase, and ptexlSO, in this order). (D) Time-course expression plots, as in C, but for genes showing clonally variant 
expression. The genes belong to the phista, dag, acs, and /?rp families, in this order, (f) Distribution of aMAFC. Genes (x-axis) are ranked by their aMAFC in 
descending order, (f) Duplication of a region of chromosome 1 0 in 3D7-B. Transcript (blue) and gDNA (red) levels in 3D7-B relative to 3D7-A (log2 ratio) 
are shown for the second half of chromosome 1 0. The names of the first and last genes duplicated in 3D7-B are shown. Time-course expression plots for 
two of the genes within the duplication are shown. 
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plemental Figs. S2, S3; Supplemental Results), for which variant 
expression has been previously described (Scherf et al. 2008). 

Variant expression in parasites of different genetic 
backgrounds 

We followed the same approach to identify genes under clonally 
variant expression in other sets of parasite lines sharing a clonal or- 
igin. Each comparison included the parental parasite line, of clonal 
origin, and several subclones (Fig. lA). We also compared transcript 
levels among the genetically different parental (original) lines 7G8, 
DIO, and HB3A (Fig. lA, parasite lines in bold) to identify the genes 
showing top differences in expression among genetically different 
parasites. In total, we identified genes under variant expression based 
on five comparisons, four between parasite lines sharing a clonal 
origin (3D 7, 7G8, DIO, and HB3 comparisons) and one between 
genetically different parasites (parentals comparison) (Fig. lA). 

For each non-3D7 genetic background, probes that target 
polymorphic sequences as inferred from BLAST and CGH analysis 
were excluded (Supplemental Methods). All genes from the highly 
polymorphic gene families var, rif, stevor, and pfmc-2tm were ex- 
cluded from the analysis of non-3 D 7 parasite lines, because CGH 
and BLAST analysis revealed that this microarray (designed against 
the 3D 7 reference genome) is unsuitable to study expression of 



these families in non-3 D 7 parasite lines (Supplemental Fig. S4). 
However, it is already well established from our data with 3D 7 
parasites and by others (Scherf et al. 2008) that these gene families 
show extensive clonally variant expression. Furthermore, as in the 
case of the 3D 7 comparison, we excluded from the lists of variant 
genes all the cases in which deletions, duplications, or other ge- 
netic alterations (as determined by CGH of individual parasite 
lines) may account for the differences in transcript levels observed. 
Genes showing variant expression based only on LDIO (Fig. lA) 
were also excluded from the list of 7G8 variant genes because this 
subclone has DNA sequences of non-7G8 origin (see Methods). 
Even after excluding all the genes where different transcript levels 
may be attributable to genetic differences and further filtering the 
lists of variant genes (Supplemental Figs. S5, S6, S7; Supplemental 
Table SI; Supplemental Results), we identified a substantial num- 
ber of variant genes in all comparisons (Fig. 2; Supplemental Tables 
SI, S2). No two single parasite lines analyzed were transcriptionally 
identical. Transcriptional differences between isogenic subclones 
indicate that all of the parental parasite lines from which they were 
derived, of clonal origin but grown for many generations after 
cloning, consist of a mixture of single parasites with different tran- 
scriptional patterns, despite growing under homogeneous condi- 
tions. These results demonstrate that genetically homogeneous 
clonal populations show high levels of spontaneous epigenetic 
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Figure 2. Overview of genes showing transcriptional variability. Expression patterns are shown for the clonally variant genes identified in the 3D7 {A), 
7G8 (S), DIO (C), HB3 (D), and parental lines (f) comparisons. The numbers in parenthesis indicate the number of variant genes in each comparison. 
Genes from the var, rif, stevor, and pfmc-2tm families are not included in the 3D7 heatmap, to make it comparable with the other parasite lines. Values are 
the log2 of the expression fold-change relative to the average expression in the parasite lines within each comparison. (Black dots) Genes showing the same 
transcriptional patterns in qPCR analysis of independent biological preparations; (red dots) genes that showed different patterns (Supplemental Fig. S8; 
Supplemental Table S3). 
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variation. Importantly, qPCR analysis on cDNA from independent 
parasite preparations revealed that transcriptional patterns were 
stable for the majority of genes tested (dots in Fig. 2; Supplemental 
Fig. S8); implying stable epigenetic heritability, with expression 
switches occurring at low frequency. 

Overlap in the variant genes identified 
in the different comparisons 

There was extensive overlap in the genes that showed variant ex- 
pression in the different comparisons (Fig. 3A), indicating that var- 
iant expression is an intrinsic property of certain genes. Eight genes 
showed variant expression in four or five of the comparisons. More 
than 40% of the genes that showed variant expression in the 3D 7 
comparison also showed variant expression in at least one other 
comparison, and the same was true for a similar or higher percentage 
of the variant genes identified in each comparison (Fig. 3B). Re- 
markably transcriptional variability between genetically different 
parasites seems to affect similar genes as transcriptional variability 
within clonal parasites populations: 70% of the genes more differ- 
entially expressed between the genetically different 7GS, DIG, and 
HB3 parental parasite lines (parentals comparison) were also var- 
iantly expressed in at least one of the comparisons involving isogenic 
parasite lines. The cumulative number of variant genes identified 
increased with the number of comparisons (Fig. 3C). In total, 176 
different variant genes were identified (excluding large gene families 
that were only analyzed in the 3D 7 comparison; 252 variant genes 
were identified including these families) (Supplemental Table S2). 

The overlap between the gene families with variant expres- 
sion in the different comparisons was even more extensive (Fig. 
3D), and the cumulative number of variant gene families identified 
clearly approached a plateau (Fig. 3E). This result indicates that we 
may have captured the majority of P. falciparum gene families un- 
der clonally variant expression. 

Gene families under clonally variant expression 

Many of the variantly expressed genes (1 73 out of 252) are members 
of gene families, the majority of them known or predicted to par- 
ticipate in host-parasite interactions. In total, we identified 28 gene 
families with at least two members listed as variantly expressed, or 
one member listed as variant in two comparisons (Table 1). Even 
after excluding the large variant gene families var, rif, stevor, and 
pfmc-2tm from the analysis, 97 of 176 variant genes were members 
of gene families, and 40% had paralogs according to PlasmoDB/ 
OrthoMCL, compared with only 9% in the full genome (two-tailed 
Fischer's exact test, P < 0.001). Functional redundancy in some gene 
families probably results in permissivity for variant expression. 

Control gene sets encoding single-copy, essential genes showed 
minimal levels of transcriptional heterogeneity in all sets of para- 
site lines compared (Fig. 4). In contrast, we identified several gene 
families that showed extensive transcriptional variability in all five 
comparisons, and gene families such as apiAP2 or lysophospholi- 
pases that showed stable expression levels for the majority of 
genes, with substantial variant expression restricted to specific 
members (Fig. 4). These results indicate again that variant expres- 
sion is an intrinsic property of some genes and gene families. 

Many of the variant gene families are subtelomeric and en- 
code proteins exported to the erythrocyte (Table 1), including 
exported HSP40-type cochaperones (dnaj) (Sargeant et al. 2006), 
predicted exported protein kinases (fikk) (Nunes et al. 2007), and 
exported proteins of unknown function (such asphist, hyp, and gbp 
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Figure 3. Overlap in the variantly expressed genes identified in the dif- 
ferent comparisons. (A) Genes (in rows) that showed variant expression in 
the 3D7, 7G8, DIG, HB3, or parental comparisons are shown in yellow in 
the corresponding lane. Genes from the var, rif, stevor, and pfmc-2tm 
families were excluded from this analysis, because they were only analyzed 
in the 3D7 comparison. (6) Percentage of variant genes identified in each 
comparison that showed variant expression in at least one other compari- 
son. (C) Cumulative number of variantly expressed genes identified with 
increasing number of comparisons. Values are the average, with 95% ci, of 
all possible combinations of comparisons. (D,f) Same as panels /4 and C, but 
at the gene family level. Gene families are defined in Supplemental Table S4. 

families) (Sargeant et al. 2006). Additionally, variant expression 
was observed in gene families that encode proteins that participate 
in lipid metabolism, such as acyl-CoA synthetases (acs), acyl-CoA 
binding proteins (acbp) (Bethke etal. 2006), and lysophospholipases; 
proteins involved in nutrient import and possibly in erythrocyte 
invasion (dag) (Cortes 2008; Nguitragool et al. 2011); transcription 
factors (apiAPZ) (Campbell et al. 2010); proteins involved in trans- 
mission {6-cys) (Templeton and Kaslow 1999); secreted proteins 
with functions in hemozoin formation and with procoagulant ac- 
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Table 1. P. falciparum gene families showing clonally variant expression 



Variant 



HP1/ 



Gene families 




h 

expression 


H3K9me3*' 


Subtelomeric*^ 


Exported^ 


6-cys 


10 


3 


+/- 


- 


- 


apiAP2 


29 


2 


+/- 


- 


- 


ciag 


5 


3 


+ 


+ 


- 


etramp 


14 


4 


- 


+ 


- 


PFD0075-like 


4 


3 


+ 


+ 


- 


rRNA 


9 


2 


+/- 


- 


- 


tRNA 


53 


20 


+/- 


- 


- 


acbp 


4 


1 


+/- 


+ 


+/- 


acs 


12 


4 


+ 


+ 


+/- 


lysophospholipase 


15 


1 


+/- 


+ 


+/- 


fikk 


21 


2 


+/- 


+ 


+ 


gbp 


3 


2 


+ 


+ 


+ 


hrp 


1 


1 


- 


+ 


+ 


hyp4 


1 


1 


+ 


+ 


+ 


hyp6 


2 


2 


+ 


+ 


+ 


hyp7 


3 


3 


+ 


+ 


+ 


hypIO 


2 


1 


+ 


+ 


+ 


1 ly p 1 J 


4 


2 








hyp17 


2 


2 


+ 


+ 


+ 


phista 


20 


11 


+ 


+ 


+ 


phistb 


23 


5 


+/- 


+ 


+ 


phistb_dnaj 


7 


3 


+/- 


+ 


+ 


phistc 


16 


5 


+/- 


+ 


+ 


dnajJII 


8 


3 


+ 


+ 


+ 


pfmc-2tm^ 


9 


7 


+ 


+ 


+ 


rif 


157 


26 


+ 


+ 


+ 


stevor^ 


36 


9 


+ 


+ 


+ 


var^ 


75 


34 


+ 


+ 


+ 



Only gene families with more than one member showing variant expression, or a single member 
showing variant expression in more than one comparison, are included. Gene families are described in 
Supplemental Table S4. 

^Number of genes of the family that were analyzed. 

"^Number of genes showing variant expression (included in the list of variant genes) (Supplemental 
Table S2). 

•^Gene families with >30% (+) or some, but <30% (+/-) of the members marked by HP1 or H3K9me3 
(Flueck et al. 2009; Lopez-Rubio et al. 2009; Salcedo-Amaya et al. 2009). 

^Gene families with >50% of the members located at subtelomeric regions (<150 kb from the chro- 
mosome end) are indicated as positive. 

^Gene families encoding proteins predicted to be exported to the erythrocyte (Sargeant et al. 2006) are 
indicated as positive, whereas +/- indicates gene families without a PEXEL motif that may be exported 
(Templeton 2009). 
^Analyzed only in 3D7 parasite lines. 



tivity (hrp) (Ndonwi et al. 2011); and proteins located at the para- 
site — host cell interface (etramp) (Table 1; Spielmann et al. 2003). 
Surprisingly, many tRNAs were also expressed at different levels 
between different parasite lines. Other genes were not part of any of 
the variant gene families included in Table 1, but also showed 
clonally variant gene expression. Among those, there were 14 genes 
encoding exported proteins (Sargeant et al. 2006); the gene eba-140 
(MAL13P1.60) and the pseudogenePfi^^3 (PFL2520w) from families 
linked to erythrocyte invasion (Cortes 2008); surftn4.1 (PFDOlOOc) 
(Winter et al. 2005); three genes encoding putative Zn-finger pro- 
teins; genes related with gametocyte formation, for which expres- 
sion levels in asexual parasites may determine the propensity to 
enter sexual differentiation (Supplemental Results); two more en- 
zymes linked to lipid metabolism (beta-hydroxyacyl-ACP dehy- 
dratase and dephospho-CoA kinase); and 43 genes encoding pro- 
teins annotated as ''unknown function" (Supplemental Table S2). 

Gene set enrichment analysis (GSEA) (Subramanian et al. 
2005) did not reveal any additional gene family, metabolic pathway, 
biochemical process, or protein domain associated with high levels 
of variant expression (data not shown). However, genes that have 
paralogs, lack identified orthologs in other species, are located in 



subtelomeric regions, are pseudogenes, or 
encode proteins with transmembrane do- 
mains showed a significantly skewed dis- 
tribution toward high levels of transcrip- 
tional variability (Supplemental Fig. S9A). 



Transcriptional heterogeneity enables 
adaptation to periodic Iieat-sliock 

To test the possibility that clonally vari- 
ant gene expression can mediate adapta- 
tion to environmental changes, we mea- 
sured survival of three parasite lines of the 
same genetic background (3D7-A and two 
of its subclones) (Fig. lA) after a 3-h heat- 
shock at 41.5°C, mimicking a malaria 
high-fever episode. Temperature is a ma- 
jor changing condition for malaria para- 
sites that fluctuates between stable 37°C 
in some human hosts and periodic febrile 
episodes in others (mainly young chil- 
dren or malaria-naive individuals). One 
of the recently subcloned parasite lines, 
1.2B, showed a markedly higher resis- 
tance to heat-shock (Fig. 5A), indicating 
that transcriptional heterogeneity results 
in phenotypic differences. We next sub- 
jected parasites to periodic heat-shocks 
for five consecutive generations and ob- 
served that the transcriptionally diverse 
parasite line 3D7-A rapidly adapted and 
acquired a 1.2B-like heat-shock resistant 
phenotype (Fig. 5B,C). This is consistent 
with adaptation by selection of pre-existing 
parasites with a pattern of expressed and 
repressed genes that confers low sensi- 
tivity to heat-shock. On the other hand, 
the recently subcloned (transcriptionally 
homogeneous) line lOG adapted only to 
a much lower extent and adaptation 
started later. Adapted 3D7-A parasites 
maintained the heat-shock resistant phenotype after five genera- 
tions without heat-shock (Fig. 5B,C, cycle 11), demonstrating 
stable epigenetic transmission of the transcriptional patterns that 
confer heat-shock resistance, even in the absence of heat-shock. In 
summary, the observed pattern of adaptation to heat-shock is ex- 
actly as predicted if adaptation occurs via natural selection of pre- 
existing parasites with specific transcriptional patterns, and not 
consistent with adaptation via immediate directed transcriptional 
responses (Fig. 5D). 

Epigenetic mechanisms regulating clonally variant 
gene expression 

Many of the genes that showed variant expression were found to 
be positive for the heterochromatin marks H3K9me3 and HPl in 
previous chip on ChlP-based mapping studies (Supplemental Ta- 
ble S2; Flueck et al. 2009; Lopez-Rubio et al. 2009; Salcedo-Amaya 
et al. 2009). Only two of the 28 variant gene families included in 
Table 1 were HPl- and H3K9me3 -negative. Likewise, of 35 gene 
families reported to be marked by HPl or H3K9me3, only five small 
families did not have any member showing clonally variant ex- 
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Figure 4. Distribution of variant expression in P. /b/c/paru/r? gene families. The log2 of the aMAFCforeach comparison is shown for the genes of selected 
families. Aminoacyl-tRNA-synthetases (ars) and genes encoding the basal transcription machinery (Transcr.) (Bischoff and Vaquero 2010) provide ap- 
propriate controls as gene sets encoding essential genes, for which variant expression is not expected, for phist, hyp, and dnoy families, the specific family 
identity is indicated at the right of the heatmaps. Gene families are detailed in Supplemental Table S4. The tracks at the right indicate whether a gene was 
included in the list of variant genes (Supplemental Table S2), whether it was positive for heterochromatin marks HP1 or H3K9me3 in any of the published 
reports (Flueck et al. 2009; Lopez-Rubio et al. 2009; Salcedo-Amaya et al. 2009), and whether it is subtelomerically located (<1 50 kb from a chromosome 
end). Genes that show high aMAFC in some of the tracks but are not in the list of variant genes were excluded because changes in expression were 
attributable to CNVs, or for other reasons detailed in the text. 



pression (Fig. 6A). The association between variant expression and 
heterochromatin marks at the gene family level was significant 
(two-tailed Fischer's exact test, P < 0.001). The few discrepancies are 
probably explained by all members of some families being in an 
active state in the parasite lines used for the HPl/H3K9me3 map- 
ping, or by variant genes being in the same state, either active or 
repressed, in all our parasite lines. The striking correlation between 
heterochromatin marks and variant expression at the gene family 
level was also observed at the individual gene level within some 
families such as lysophospholipase, apiAP2, achp (Fig. 4), or dag. 
The association between variant expression and heterochromatin 
marks at the gene level was highly significant by a two-tailed 
Fischer's exact test (P < 0.001), either including only genes from the 
gene families listed in Table 1 or the full genome, and also ex- 
cluding the major variant gene families var, rif, stevor, and pfmc-2tm 
from the analysis. The association was also highly significant when 
analyzed by GSEA (Supplemental Fig. S9B). These results strongly 
suggest a general mechanistic link between H3K9me3 -based het- 
erochromatin and phenotypic variation. Previous studies showed 



that variant expression of var genes and genes encoding merozoite 
proteins is mediated by reversible formation of heterochromatin, 
with H3K9me3 associated to the repressed state (Chookajorn et al. 
2007; Lopez-Rubio et al. 2007; Jiang et al. 2010; Comeaux et al. 
2011; Crowley et al. 2011). Our data suggest that this is a general 
mechanism also controlling variant expression of other functionally 
unrelated gene families, including members of non-subtelomeric 
gene families such as apiAP2. However, clonally variant genes that 
are not members of gene families were generally not marked by 
H3K9me3 or HPl and may not be regulated by heterochromatin 
formation (Supplemental Table S2). Specific transcription factors, 
such as an ApiAP2 (Campbell et al. 2010) that showed variant ex- 
pression (PFL1085W) and is HPl/H3K9me3-positive (Fig. 6C), may 
be master regulators controlling the variant expression of other 
genes. 

The stage of expression during the cycle is not coregulated 
along the chromosomes (Bozdech et al. 2003). However, our results 
indicate that, at least for gene families involved in host-parasite 
interactions, and in contrast with developmental regulation, clon- 
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Figure 5. Adaptation to heat-shock. (A) Inhibition of parasite growth by a 3-h heat-shocl< at 41 .5°C. Values are the average of three independent 
experiments, with standard deviation, and represent percentage of growth relative to identical cultures not subjected to heat-shock. Heat-shock was 
performed when parasites were at the trophozoite stage, and parasitemia was measured by FACS at the next generation. (B) Growth under periodic heat- 
shock. Parasites at the trophozoite stage were subjected to heat-shock for five consecutive generations. After each cycle, heat-shocked cultures were 
synchronized, adjusted to 1% parasitemia, and split into two identical dishes to compare again growth between heat-shock and control conditions. 
Cultures were grown without heat-shock in cycles 6-10. For details, see Supplemental Methods. (C) Percentage of growth under heat-shock relative to 
normal conditions, calculated from the data in panel B. (D) Schematic model for adaptation through directed transcriptional responses and adaptation 
through spontaneous clonally variant gene expression (bet-hedging). (Small circles) Individual genes that can be either repressed (crossed) or active 
(green arrow). In the directed transcriptional response scenario, a change in the environment results in an immediate protective transcriptional response, 
such that the external change is sensed, and specific genes are activated or repressed to mediate adaptation to the new conditions. In contrast, in the bet- 
hedging scenario, the population is transcriptionally heterogeneous as a consequence of spontaneous clonally variant gene expression. Upon a change in 
the environment, pre-existing parasites with combinations of expressed and repressed genes that confer fitness under the new conditions are selected and 
survive, whereas other parasites die (broken line). 



ally variant expression is regulated by heterochromatin formation. 
H3K9me3-based heterochromatin has the capacity to spread into 
adjacent regions (Grewal and Jia 2007), which may result in co- 
ordinated repression of neighboring genes. Many of the genes 
under variant expression cluster together in subtelomeric regions, 
but we observed little evidence for coordinated expression pat- 
terns. Different patterns of expression were often observed for 
genes within the same variant expression cluster (Fig. 6B), and 
stand-alone variant genes were observed in pairwise parasite line 
comparisons (Fig. 6C,D). Possible coordinated repression along 
a large chromosome region was only observed in a subtelomeric 
region of chromosome 14, which includes several genes involved in 
gametocyte formation (Eksi et al. 2005), but further along the 
chromosome neighboring genes showed opposite expression pat- 
terns in some parasite lines (Fig. 6E). The general lack of coordinated 
expression in clusters of neighboring genes that can be repressed by 
formation of heterochromatin indicates that subtelomeric regions 
are organized into small chromatin domains, possibly delimitated 



by barrier insulators, as previously suggested by us (Crowley et al. 
2011). This would allow independent variant expression of neigh- 
boring genes, maximizing the number of possible combinations of 
expressed and repressed genes and consequently the adaptative 
potential of clonally variant gene expression. 

Discussion 

Here we present, to our knowledge, the most comprehensive 
characterization of epigenetic variation in the malaria parasite 
P. falciparum, and we provide experimental evidence for adaptation 
of malaria parasites to changes in their environment mediated by 
clonally variant expression. Our experimental design and analysis 
resulted in unprecedented accuracy in the transcriptional com- 
parison of different parasite lines, which allowed us to distinguish 
between authentic epigenetic, clonally transmitted transcriptional 
differences, and variation attributable to genetic differences or to 
different relative abundance of parasites at different stages. This 
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Figure 6. Heterochromatin-based regulation of variant expression. (A) Correlation between variant expression and h ete roc hro matin marks at the gene 
family level. (Yellow) Gene families (described in Supplemental Table S4) with at least one member showing variant expression ("Variant" column) or 
positive for heterochromatin marks ("HP1 /H3K9me3" column) in any of the published studies (Fluecketal. 2009; Lopez-Rubio etal. 2009; Salcedo-Amaya 
et al. 2009). (Dark yellow) Gene families with only one gene showing variant expression in only one of the comparisons (gene families not included in Table 
1). (B-E) Chromosomal distribution of variantly expressed genes. (Top lane, green) The log2 of the aMAFC in the 3D7 comparison. The other lanes 
represent the relative transcript (blue) and gDNA (red) levels in pairwise parasite line comparisons, expressed as the log2 ratio of the fold-change. (Bottom 
lane, black) HP1- or H3K9me3-positive genes. 



approach led to the identification of a large number of clonally 
variant genes and gene families and demonstrated that genetically 
homogeneous parasite populations derived from a single parasite 
(clonal) are epigenetically highly heterogeneous. The extensive 
overlap in the genes and gene families that show variant expres- 
sion in comparisons involving different sets of parasite lines in- 



dicates that spontaneous clonally variant expression is an intrinsic 
property of some genes and gene families. Data showing expres- 
sion variability for each P. falciparum gene and the subclones with 
defined transcriptional patterns are important resources that will 
be available through PlasmoDB (http://www.plasmodb.org) and 
MR4 (http://www.mr4.org). 
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The critical aspects of our study design and analysis that 
allowed identification of the majority of inherently variant gene 
families were (1) transcriptional comparisons involving parasite 
lines of the same clonal origin, including recently subcloned par- 
asite lines; (2) transcriptional comparison along the full asexual 
blood cycle for a large number of parasite lines, belonging to four 
different genetic backgrounds; (3) tight synchronization of para- 
sites to a well-defined narrow age window of 0-5 h post-invasion, 
completely avoiding contaminating parasites from the previous 
generation; (4) identification of transcriptional differences based 
on expression plots with statistically estimated times, and calcu- 
lation of expression fold-changes from the area underneath the 
curve of the plots (aMAFC score); (5) exclusion of different probes 
for the analysis of parasites of different genetic backgrounds, based 
on CGH and blast analysis; and (6) CGH analysis of all parasite 
lines, to avoid the potential confounding effect of genetic alter- 
ations. This workflow provides the necessary level of accuracy to 
identify, with high confidence and sensitivity, epigenetic differ- 
ences between independent parasite lines. However, some clonally 
variant genes and gene families may have escaped identification by 
our approach (false negatives). Those may include genes with very 
fast switching rates (resulting in a mixed population for their ex- 
pression status even in the limited number of generations from 
subcloning to analysis), genes that under culture conditions are in 
the same active or repressed state in the vast majority of parasites, 
and genes that had to be excluded from the microarray analysis 
because they are highly polymorphic or have near-identical para- 
logs. Regarding this last point, only in a small number of gene 
families did the majority of genes have to be excluded from the 
analysis (Supplemental Fig. SIO). 

H3K9me3 is a hallmark of constitutive heterochromatin in 
higher eukaryotes (Trojer and Reinberg 2007), but in P. falciparum 
this mark plays a major role in facultative heterochromatin for- 
mation at loci involved in host-parasite interactions. The vast 
majority of gene families that are positive for heterochromatin 
marks also show variant expression, demonstrating a general asso- 
ciation between H3K9me3-based heterochromatin gene repression 
and phenotypic variation. This observation reveals a scenario in 
which specific regions of the largely euchromatic P. falciparum 
genome are organized into bistable chromatin domains. These 
regions can be found in two alternative clonally transmitted states, 
euchromatin or heterochromatin, which are associated with tran- 
scriptional activity or repression, respectively. Bistable chromatin 
is prone to infrequent spontaneous switches between its two 
possible states, as a consequence of molecular noise (Dodd et al. 
2007; Crowley et al. 2011). The striking correlation between vari- 
ant expression and heterochromatin marks allows us to predict 
that the majority of the genes that were positive for heterochro- 
matin marks in previous studies may be variantly expressed, im- 
plying that —10% of the genome may be under clonally variant 
expression. Considering that we observed only very few cases of 
coregulated variant expression, this represents an unlimited number 
of possible combinations of expressed and repressed genes, with the 
exception of families such as var or clag in which specific rules of 
mutually exclusive expression apply (Cortes et al. 2007; Scherf et al. 
2008). 

Our results provide a snapshot of transcriptional diversity 
within clonal parasite populations, revealing that variable ex- 
pression mainly affects gene families involved in host-parasite 
interactions. The degree of genetic diversity between isolates in 
these gene families provides insights into the physiological role of 
transcriptional variability. Gene families specifically amplified in 



the P. falciparum lineage can be roughly divided into those that 
show hypervariability between parasite isolates, such as var, rif 
stevor, dnd pfmc-2tm, and those that are conserved between isolates, 
including the majority of other amplified families (e.g., phist, dnaj, 
fikk, etramp, acs, lysophospholipases, and many others) (Jeffares 
et al. 2007; Mu et al. 2007; Volkman et al. 2007; Templeton 2009). 
Amplification of these two types of gene families was probably 
driven by immune evasion and functional diversification, respec- 
tively (Templeton 2009). We clearly observed clonally variant ex- 
pression for both types of gene families, suggesting that transcrip- 
tional variability results not only in antigenic variation but also in 
functional variation. 

This idea is further supported by the known or predicted 
function of the gene families for which transcriptional variability 
is an intrinsic property and is suggestive of a role for variant ex- 
pression in P. falciparum adaptation beyond immune evasion. Gene 
families for which we observed variant expression participate in 
metabolic processes such as lipid metabolism or nutrient import 
{acs, acbp, lysophospholipases, and clag), in protein folding/stability 
(predicted cochaperones of families dnajlll and phistb/dnaj), or in 
erythrocyte invasion (eba and possibly clag), among other func- 
tions. Expression of alternative members of these families, possi- 
bly with different functional abilities, has the potential to mediate 
adaptation to changes in host metabolic/nutritional conditions, 
changes in temperature associated to febrile episodes, or host ge- 
netic polymorphism in erythrocyte receptors, respectively, which 
are probably the major fluctuating environmental conditions that 
P. falciparum asexual stages face beyond immune pressure (Merrick 
and Duraisingh 2010). Importantly, two of the transcriptionally 
different parasite lines characterized in this study differ in their 
ability to use alternative invasion pathways or to invade eryth- 
rocytes with the cerebral malaria-protective mutation SAO (Cortes 
et al. 2004), and here we provide evidence that some of the par- 
asite lines also differ in their sensitivity to heat-shock. Also sup- 
porting the idea that variant expression can modulate fitness and 
hence permit adaptation upon challenge, switches in the ex- 
pression of clagS genes were recently shown to result in altered 
sensitivity to chemical inhibitors (Nguitragool et al. 2011) and 
may determine changes in solute permeability. Another case of 
special interest is variant expression of the expanded gene family 
acs. This gene family encodes enzymes that participate in host- 
parasite interactions by mediating activation of fatty acids scav- 
enged from the host, with different specificities predicted for 
different members of the family (Tellez et al. 2003; Bethke et al. 
2006). In summary, the large number of gene families under 
clonally variant gene expression and their characteristics indicate 
that they have the potential to mediate adaptation to the major 
fluctuating conditions encountered by P. falciparum asexual blood 
stages, clearly beyond immune evasion. 

Adaptation through clonally variant gene expression is fun- 
damentally different from adaptation through directed transcrip- 
tional responses (Fig. 5D). We propose that spontaneous, stochastic 
switches between the active and repressed states for a large number 
of genes generate transcriptional diversity within clonal parasite 
populations before any challenge is applied, thus representing a bet- 
hedging (risk-spreading) strategy (Veening et al. 2008). Upon a 
change in the environment, transcriptional heterogeneity provides 
the grounds for natural selection of pre-existing parasites with tran- 
scriptional patterns that confer maximum fitness under the new 
conditions. Diversity confers fitness to a population in changing 
environments, and mathematical models predict that diversity 
generated by stochastic variation is favored over sensing followed by 
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transcriptional responses when the environment changes infre- 
quently (Kussell and Leibler 2005), as in the case of the environment 
where P. falciparum blood stages reside. Adaptation through a sto- 
chastic phenotype-switching mechanism, controlled at the epige- 
netic level, occurs at a much faster time scale than genetic adapta- 
tion and is both heritable and reversible. Our results showing 
extensive transcriptional variation for genes involved in the in- 
teraction of the parasite with its environment provide strong sup- 
port to the idea that bet-hedging strategies play a predominant role 
in P. falciparum adaptation. This idea is further supported by our 
experiments studying adaptation to febrile temperatures, one of the 
common fluctuating conditions faced by malaria parasites. The 
observed pattern of adaptation is fully consistent with adaptation 
via selection of pre-existing parasites and epigenetic transmission of 
the transcriptional patterns that result in adaptation. While adap- 
tation via spontaneous clonally variant expression/selection and 
adaptation via directed transcriptional responses are not mutually 
exclusive, the former strategy may compensate for the reported 
limited ability of P. falciparum to mount immediate directed re- 
sponses. This is a testable hypothesis that has important implica- 
tions not only to understand how this devastating parasite adapts to 
natural fluctuating environmental conditions, but also to predict 
how it may evolve in front of renewed efforts to control or eradicate 
malaria (Mackinnon and Marsh 2010). Carefully designed future 
studies will need to identify the variantly expressed genes that de- 
termine survival under specific physiologically relevant conditions. 
Any studies aimed at identifying genes that mediate adaptation to 
specific environmental conditions should be designed and inter- 
preted considering that in P. falciparum adaptation may require a 
time scale consistent with natural selection of pre-existing epige- 
netic variants. 



Methods 

Parasites 

Clonal parasite lines 3D 7, 7G8, DIO, and HB3 have been pre- 
viously described (Anders et al. 1983; Burkot et al. 1984; Walliker 
et al. 1987). 3D7 stocks 3D7-A and 3D7-B, and 3D7-A subclones 
have also been described before (Cortes et al. 2004, 2007; Cortes 
2005). HB3A and HB3B are stocks of the same HB3 clonal parasite 
line obtained before (HB3A) or after (HB3B) mosquito and chim- 
panzee passage (Walliker et al. 1987), kindly provided by O. Kaneko 
(Nagasaki University, Japan). The 7G8 parental parasite line and 
7G8 subclones identified as 7G8 self-fertilization events (MRA- 
926, MRA-927, MRA-928, MRA-931, and MRA-933, MR4, ATCC; 
Manassas, VA) in a 7G8 X GB4 genetic cross (Hayton et al. 2008) 
were obtained from MR4 (http://www.mr4.org). Other subclones 
were generated in this study by limiting dilution. We cultured 
subclones for as few generations as possible between subcloning 
and transcriptional analysis (for subclones generated by us, typi- 
cally about 20 cycles since limiting dilution). The genetic identity 
of all parasite lines was corroborated by PCA of CGH data (Sup- 
plemental Fig. SI 1). Sequencing of a highly polymorphic region of 
the gene ama-1 (nucleotides 450-930, non-3D7 parasite lines) or 
msp2 restriction fragment length polymorphism (RFLP) (Cortes 
et al. 2007) confirmed the genetic identity of all parasite lines ex- 
cept for the 7G8 subclone LDIO. This subclone had been identified 
as a 7G8 self-fertilization event by microsatellite typing (Hayton 
et al. 2008), and our PCA clustered it with other 7G8-derived par- 
asite lines, but amal sequencing, performed after completing the 
study, revealed a GB4 ama-1 sequence. Consequently, genes that 
were identified as clonally variant based only on transcript levels in 



LDIO were excluded from the list of 7G8 variant genes (Supple- 
mental Table SI), because in these cases transcriptional differences 
may be attributable to genetic differences. 

Parasites were cultured under standard conditions in media 
containing Albumax II (Invitrogen) and no human serum, in a 5% 
CO2, 3% O2, 92% N2 atmosphere. Tight synchronization was 
achieved by Percoll (3D 7 parasite lines) or magnet (all other para- 
site lines) purification of schizonts followed by sorbitol lysis 5 h 
later, to obtain a population of a defined age window of 0-5 h post- 
invasion. Parasites were split into six or seven dishes that were 
cultured undisturbed for different periods of time (10, 20, 30, 34, 
37, 40, or 43 h) before collecting in TRIzol and freezing at -80°C. 
Details of heat-shock experiments are presented in the Supple- 
mental Methods. 

Microarray experiments 

The microarray used in this study is based on long oligonucleotides 
(70 nt) designed against the PlasmoDB 4.4 release of the 3D 7 ref- 
erence P. falciparum genome (Hu et al. 2007). We reannotated the 
microarray against the 3D7 PlasmoDB 7.0 genome release, which 
resulted in changes in more than 700 probes, including removal of 
cross-reactive probes. Additional probes were excluded for the 
analysis of parasites genetically different from 3D7, because some 
probes fall within highly polymorphic regions of the genome 
(Supplemental Figs. S4, SIO, S12; Supplemental Table S5; Supple- 
mental Methods). 

RNA was purified using the TRIzol method, and cDNA was 
synthesized by reverse transcription, purified, labeled, and hy- 
bridized essentially as previously described (Bozdech et al. 2003). 
Microarray images were analyzed using GenePix Pro 6.0 (Axon 
Instruments), with careful visual inspection to refine automatic 
spot definitions and to eliminate poor quality spots. All test sam- 
ples (labeled with Cy5) were hybridized against a reference pool 
(labeled with Cy3), which consisted of a mixture of cDNA from 
3D 7- A and 3D7-B rings, trophozoites, and schizonts in equal 
amounts for samples from 3D7-derived parasite lines, or from DIO, 
7G8, and HB3 for all other samples. 3D 7 was not included in the 
parentals comparison because it was hybridized against a different 
reference pool. For CGH analysis, sonicated gDNA of test parasite 
lines was labeled with Cy5 and hybridized against 3D7-A gDNA 
labeled with Cy3. 

Data analysis 

All data were processed using Bioconductor in an R environment. 
Spots with Cy3 and Cy5 intensities lower than 1.5 -fold times the 
microarray background were eliminated. When a spot had in- 
tensity higher than 1.5 times the background for only Cy3 or Cy5, 
the other channel was assigned a value of 1.5 times its microarray 
background, to avoid extreme ratio values. After background sub- 
traction, the logz of the Cy5/Cy3 intensity ratio was normalized 
using locally estimated scatterplot smoothing (LOESS). Gene level 
log2(Cy5/Cy3) expression ratios were computed from probe level 
values using median polish. We calculated the most likely time 
post-invasion along the asexual cycle for each sample using a sta- 
tistical likelihood method (Lemieux et al. 2009), with the HB3 
transcriptome as reference (Bozdech et al. 2003). Expression values 
for each gene [log2(Cy5/Cy3)] were plotted against estimated times 
for each parasite line. For genes with values for at least five time 
points, missing value (s) were estimated using linear model pre- 
dictors. Estimated values were <1% of the total. 

To compare expression levels among a given set of parasite 
lines and to identify genes under variant expression, the area under 
the curve of the time-course plots was computed for the time in- 
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terval between the latest first time point and the earliest last time 
point among the parasite lines compared (time compared, TC). For 
each gene, areas were only calculated for parasite lines that had 
values for all time points (either experimental or estimated). For 
developmentally regulated genes that are only expressed during 
part of the asexual cycle (Bozdech et al. 2003), it is expected that 
potential differences in expression levels between parasite lines 
would be observed only at stages when the genes are expressed. To 
prevent a bias against such genes, areas were also computed for the 
following overlapping half-TC intervals: 0X-0.5X TC, 0.5X-1X TC, 
0.25X-0.75X TC, and the complementary to the latter, 0X-0.25X 
TC + 0.75 x-lx TC. The area difference (AD) for a given half-TC 
interval was the difference between the largest and smallest areas 
among the parasite lines compared. 

We calculated for each gene the maximum average fold- 
change across half-TC (MAFC). Because expression values were in 
log2 scale, MAFC was calculated as: MAFC = z^^^^^^^^^/^-^^ 
where max refers to the maximum of AD among the four half-TC 
intervals. The MAFC corresponds to the expression fold-change 
between the parasite line that expressed the gene at the highest 
level and the parasite line that expressed it at the lowest level, 
within a given comparison. Rather than corresponding to an ex- 
pression fold-change at a single time point, the MAFC is the av- 
erage expression fold-change across a period corresponding to 
approximately half an asexual cycle. Different sets of parasite lines 
compared showed different levels of background AD attributable 
to technical variability, and background AD was also different be- 
tween the four half-TC intervals (Supplemental Fig. SI 3). To account 
for this, we calculated for each gene an adjusted maximum average 
fold-change across half-TC (aMAFC) as: aMAFC = 2[«^^^(ai^-^>/o-5x tc]^ 
where max refers to the maximum of AD-B among the four half-TC 
intervals, and B is the background level of area differences. B was 
estimated as the 80th percentile of AD for each comparison and 
each half-TC interval (calculated excluding var, rif, stevor, and 
pfmc-2tm genes, which were eliminated for the analysis of 
non-3D7 parasite lines). The 80th percentile was selected based on 
the assumption that in all comparisons <20% of the genes have 
authentic variant expression. Because the 80th percentile reflects 
the high end of area differences that may be attributable to 
experimental variability, and because the aMAFC is an average- 
expression fold-change across a long time interval rather than 
a maximum-expression fold-change at a single time point, the 
aMAFC is a conservative estimate of the expression fold-change for 
any given gene. Comparison of the distribution of aMAFC with 
non-adjusted maximum average fold-changes (MAFC) demon- 
strates that the adjustment corrects for the different levels of 
technical variability in the different comparisons (Supplemental 
Fig. S13). 

We set up empirical criteria to produce lists of genes showing 
variant transcript levels in each comparison. These criteria were 
based on the aMAFC and permutation tests with 10,000 permu- 
tations of parasite line labels within each time point, not allowing 
two consecutive labels from the same parasite line. The P- values, 
which correspond to the probability of a random permutation 
generating a higher value than the observed AD, were corrected for 
multiple testing using the Benjamini and Hochberg method 
(Benjamini and Hochberg 1995). For the comparison of sets of 
parasite lines with the same clonal origin, all genes with aMAFC 
higher than 2.5 were included in the lists of variantly expressed 
genes. In all comparisons, an aMAFC of 2.5 was higher than the 
average plus 3 standard deviations of aMAFC in non-variant genes. 
Genes with aMAFC between 1.75 and 2.5 were only included in 
the lists of variant genes if the adjusted P- value of the permutation 
test was P < 0.05. Genes with aMAFC lower than 1.75 were not 
included in the lists. Finally, plots for variant genes were visually 



inspected, and genes with differences in expression based only on 
data points with low absolute Cy5 signal intensity (less than three 
times background) were removed from the lists of variantly 
expressed genes, because near-background values provide lower 
confidence. For the comparison of non-isogenic parasite lines 
(parentals comparison), which as expected showed higher levels of 
transcriptional variability (Supplemental Fig. SI 3) that may be 
explained by the effect of genetic differences on transcript levels 
(Gonzales et al. 2008) or by technical aspects, only genes with 
aMAFC > 3.5 were retained in the list of variant genes. This rep- 
resents only the top 2.1% of genes showing highest levels of 
transcriptional variability in this comparison, with observed ex- 
pression fold-changes (MAFC) above 6.75. 

We also normalized Cy5 intensity values (no ratio) using 
quantile normalization among samples collected at the same time 
point. These values provide only a semi-quantitative estimation of 
the level of expression of a gene, because different oligonucleotides 
have intrinsically different hybridization efficiencies, and in these 
arrays each gene is represented by a small number of probes (av- 
erage 1.9 probes/gene). Gene Cy5 intensity values were calculated 
as the median of values for the different probes. 

Heatmaps and hierarchical clustering were performed using 
TMEV 4.4 (Saeed et al. 2006). Chromosome plots were constructed 
using IGV (Robinson et al. 2011). GSEA analysis was conducted 
using GSEA Preranked (Subramanian et al. 2005). 

Data access 

The microarray data presented in this study have been submitted 
to Array Express under accession number E-MTAB-673. 
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